Database Learning for Software Agents
Abstract
With the amount of available information rapidly outstripping the ability of individuals to use it, we explore how a software agent can learn a description of an information resource (such as a database on the Internet) in order to turn it into a well-understood tool at the agent’s disposal. An agent that could do this would have access to all the information it could find without having to cache the Internet. As the agent makes queries to an information resource, it generalizes from those queries and generates hypotheses about the structure and content of the database. We therefore formulate this as a learning problem in which the input is (1) the agent’s model, i.e., its representation of the world; and (2) a series of queries to and responses from a database. The output is a mapping from fields in the information resource to predicates in the model. Our approach to this learning problem relies on overlap between the agent’s model and the information in the database. The agent uses its own knowledge to form hypotheses about the structure of the records. We have developed the correspondence heuristic, which states that a correspondence of tokens between the agent’s world model and the information resource indicates a correspondence between types. The agent matches the values of the fields in the database against facts in its model. The relationships that hold among these facts in the model are assumed to correspond to relationships in the database. Suppose that the agent makes a query to staffdir, the UW personnel directory, and gets back “Oren Etzioni 206”. The agent would have facts in its model like (lastname person37 Etzioni) and (office person37 206). From this query and this knowledge, the agent could conclude that the second field of the output is lastname and the third field is office. Our work has many similarities to structure-mapping work (Falkenhainer, Forbus, & Gentner 1986).
Both approaches rely on discovering correspondences between separate domains. Structure-mapping, however, seeks correspondence between underlying structure, while the correspondence heuristic relates tokens in order to make inferences about the structure. The correspondence heuristic is an inductive bias which can be formalized as a determination: $\forall (x, y)\; [\, T(x) \land T(y) \land (S(x) = M(x)) \Rightarrow S(y) = M(y) \,]$. T is a type predicate such as “on the UW faculty”. S is a syntactic predicate like “the first field in the output of staffdir x”. M is a semantic predicate (i.e., from the agent’s model) such as “the first name of x”. This formalization indicates three areas for work. Learning T could be handled by standard inductive learning algorithms. We assume a syntactic model of ordered fields to account for S. Future work may pursue other kinds of syntax, such as keyword-based syntax. The focus of our work is learning the appropriate M predicate. In particular, we have been exploring the problem of predicate mismatch, which occurs when instances of one type in the database are instances of a different type in the model, or when relations in the database do not correspond to primitive relations in the model. For example, imagine that the agent gets back “Oren Etzioni FR-35” from a query. FR-35 is actually the mail stop of Etzioni’s department, so there is no fact linking the person Etzioni to the string FR-35 directly. Instead, the agent must realize that the entry in the database corresponds to a chain of predicates in its model linking Etzioni to Computer Science and Computer Science to FR-35. We have devised a way of doing this using a method reminiscent of spreading activation, in which a link between two tokens is found by exploring outward from the tokens until an intersection is found.
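The outward exploration described above can be illustrated with a small bidirectional breadth-first search. This is only a sketch under stated assumptions, not the authors' implementation: the fact triples, the entity names `cs-dept` and `person37`, the predicates `department` and `mailstop`, and the function `find_chain` are all hypothetical, and the model is assumed to be a flat list of (predicate, subject, object) facts treated as an undirected graph.

```python
from collections import deque

# Hypothetical model facts as (predicate, subject, object) triples.
FACTS = [
    ("lastname", "person37", "Etzioni"),
    ("office", "person37", "206"),
    ("department", "person37", "cs-dept"),
    ("name", "cs-dept", "Computer Science"),
    ("mailstop", "cs-dept", "FR-35"),
]

def build_graph(facts):
    """Index facts as an undirected graph: node -> [(predicate, neighbor)]."""
    graph = {}
    for pred, subj, obj in facts:
        graph.setdefault(subj, []).append((pred, obj))
        graph.setdefault(obj, []).append((pred, subj))
    return graph

def _walk(parent_map, node):
    """Collect predicates from `node` back to the root of one search side."""
    preds = []
    while parent_map[node] is not None:
        prev, pred = parent_map[node]
        preds.append(pred)
        node = prev
    return preds

def find_chain(facts, start, goal):
    """Return a list of predicates linking start to goal, or None."""
    graph = build_graph(facts)
    if start not in graph or goal not in graph:
        return None
    if start == goal:
        return []
    # parents[side][node] = (previous node, predicate used); None at the root.
    parents = [{start: None}, {goal: None}]
    frontiers = [deque([start]), deque([goal])]
    while frontiers[0] and frontiers[1]:
        for side in (0, 1):
            other = 1 - side
            # Expand one full layer outward from this side.
            for _ in range(len(frontiers[side])):
                node = frontiers[side].popleft()
                for pred, nxt in graph[node]:
                    if nxt in parents[side]:
                        continue
                    parents[side][nxt] = (node, pred)
                    if nxt in parents[other]:
                        # The two explorations intersect at `nxt`.
                        head = list(reversed(_walk(parents[0], nxt)))
                        return head + _walk(parents[1], nxt)
                    frontiers[side].append(nxt)
    return None

chain = find_chain(FACTS, "Etzioni", "FR-35")
print(chain)
```

With these made-up facts, the chain from the token Etzioni to the token FR-35 passes through the person and the department, which is the kind of composite relation the agent would need to hypothesize for the third field.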
Given simplifying assumptions about the syntax, our implemented algorithm has learned staffdir as well as ls and finger (UNIX commands with tabular output can be treated as query/response databases). In the future, we will extend this approach to handle information resources found on the World Wide Web by programs that traverse the web automatically.
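The staffdir example from the abstract, in which matching tokens against model facts yields field labels, can be sketched as follows. This is a minimal illustration assuming ordered fields and exact string matching; the function `label_fields`, the distractor fact about `lab12`, and the voting scheme are assumptions made for the example and are not taken from the paper.

```python
from collections import Counter

# Model facts as (predicate, subject, object) triples. The first two come
# from the abstract's example; the third is a hypothetical distractor.
MODEL = [
    ("lastname", "person37", "Etzioni"),
    ("office", "person37", "206"),
    ("room", "lab12", "206"),
]

def label_fields(facts, rows):
    """Guess a model predicate for each field position of a response.

    Each row is a list of tokens returned by one query. Tokens are matched
    against fact values, and facts about the same entity are kept together.
    """
    votes = {}  # field position -> Counter of predicate names
    for row in rows:
        # Group matching facts by the entity they describe.
        by_entity = {}
        for pred, subj, obj in facts:
            for pos, token in enumerate(row):
                if obj == token:
                    by_entity.setdefault(subj, {})[pos] = pred
        if not by_entity:
            continue
        # Prefer the entity that explains the most fields in this row.
        best = max(by_entity.values(), key=len)
        for pos, pred in best.items():
            votes.setdefault(pos, Counter())[pred] += 1
    # The hypothesis for each position is its most frequent predicate.
    return {pos: c.most_common(1)[0][0] for pos, c in sorted(votes.items())}

hypothesis = label_fields(MODEL, [["Oren", "Etzioni", "206"]])
print(hypothesis)
```

Positions are zero-based here, so position 1 is the second field and position 2 the third. The first field stays unlabeled because this toy model holds no fact whose value is "Oren".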
Publication date: 1994